Search results for "Sequence classification"

Classification of Sequences with Deep Artificial Neural Networks: Representation and Architectural Issues

2021

DNA sequences are the basic data type processed in biological data analysis. One key component of that analysis is sequence classification, a methodology widely used to analyze sequential data of different kinds. However, applying it to DNA sequences requires a proper representation of those sequences, which is still an open research problem. Machine Learning (ML) methodologies have made a fundamental contribution to solving this problem. Among them, Deep Neural Network (DNN) models have recently shown strongly encouraging results. In this chapter, we deal with specific classification problems related to t…

Sequence; Biological data; Sequence classification; Settore INF/01 - Informatica; Artificial neural network; Process (engineering); Computer science; Deep learning; Bacteria classification; Nucleosome identification; Deep neural network; Machine learning; Data type; Component (UML); Artificial intelligence; Metagenomics; Representation (mathematics)
researchProduct

A Quantum-Inspired Classifier for Early Web Bot Detection

2022

This paper introduces a novel approach, inspired by the principles of Quantum Computing, to web bot detection, framed as real-time classification of an incoming data stream of HTTP request headers, with the aim of ensuring the shortest decision time with the highest accuracy. The proposed approach exploits the analogy between the intrinsic correlation of two or more particles and the dependence of each HTTP request on the preceding ones. Starting from the a-posteriori probability that each request belongs to a particular class, it is possible to assign a qubit state representing a combination of these probabilities for all available observations of the time series. By levera…

Settore INF/01 - Informatica; Computer Networks and Communications; bot detection; Correlation; Costs; Data models; early decision; multinomial classification; multivariate sequence classification; Predictive models; quantum-inspired computing; sequential classification; Task analysis; Time measurement; Time series analysis; Safety, Risk, Reliability and Quality
researchProduct

A new feature selection strategy for K-mers sequence representation

2014

Decomposing a DNA sequence into k-mers (substrings of length k) and counting their frequencies defines a mapping of the sequence into a numerical space, represented by a numerical feature vector of fixed length. This simple process allows sequences to be compared in an alignment-free way, using common similarity and distance functions on the numerical codomain of the mapping. The most commonly used decomposition takes all substrings of length k, which makes the dimension of the codomain exponential in k. This can affect the time complexity of the similarity computation and, more generally, of the machine learning algorithm used for sequence classification. Moreover, the presence of possible n…

Settore INF/01 - Informatica; k-mers; DNA sequence similarity; feature selection; DNA sequence classification
researchProduct
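
The k-mer mapping described in the abstract above can be illustrated with a minimal sketch. It assumes a plain four-letter DNA alphabet and overlapping windows; the function name `kmer_vector` and the lexicographic ordering of coordinates are choices made here for illustration, not details taken from the paper, and the paper's feature selection step is not shown.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumption: no ambiguity codes such as N


def kmer_vector(sequence, k):
    """Map a DNA sequence to a fixed-length vector of k-mer counts.

    The vector has len(ALPHABET) ** k entries, one per possible k-mer,
    so its dimension grows exponentially with k.
    """
    seq = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate all possible k-mers in lexicographic order so that every
    # sequence is mapped into the same coordinate system. Windows that
    # contain a letter outside ALPHABET are simply not represented.
    return [counts.get("".join(p), 0) for p in product(ALPHABET, repeat=k)]


vec = kmer_vector("ACGTAC", 2)
print(len(vec), sum(vec))  # 16 coordinates, 5 windows counted
```

Because the vector length is 4^k for DNA, even moderate values of k give very large feature spaces, which is the cost that motivates selecting a subset of k-mers.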

Alignment free Dissimilarities for sequence classification

2015

One way to represent a DNA sequence is to break it down into substrings of length L, called L-tuples, and count the occurrences of each L-tuple in the sequence. This representation defines a mapping of a sequence into a numerical space, represented by a numerical feature vector of fixed length, which allows sequence similarity to be measured in an alignment-free way simply by applying dissimilarity functions between vectors. This work presents a benchmark study of four alignment-free dissimilarity functions between sequences, computed on their L-tuple representation, for the purpose of sequence classification. In our experiments, we have tested the classes of geometric-based, correlation-based and information-based …

Settore INF/01 - Informatica; k-mers; L-tuples; DNA sequence similarity; DNA sequence classification; Knn classifier
researchProduct
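
To make the three classes of dissimilarity named in the abstract above concrete, here is a hedged sketch with one representative of each, applied to L-tuple count vectors. The specific functions (Euclidean distance, one minus Pearson correlation, Jensen-Shannon divergence) are assumptions chosen for illustration; the abstract does not say which four functions the study benchmarks.

```python
import math


def euclidean(u, v):
    """Geometric-based example: Euclidean distance between count vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def correlation_dissimilarity(u, v):
    """Correlation-based example: 1 minus the Pearson correlation."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    if su == 0 or sv == 0:
        return 1.0  # convention chosen here for constant vectors
    return 1.0 - cov / (su * sv)


def jensen_shannon(u, v):
    """Information-based example: Jensen-Shannon divergence in bits.

    Count vectors are first normalized to frequencies; both vectors are
    assumed to contain at least one non-zero count.
    """
    total_u, total_v = sum(u), sum(v)
    p = [a / total_u for a in u]
    q = [b / total_v for b in v]
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the divergence.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


u, v = [2, 1, 1, 0], [0, 1, 1, 2]
print(euclidean(u, v), correlation_dissimilarity(u, v), jensen_shannon(u, v))
```

Any of these functions can be plugged into a nearest-neighbour classifier, since each takes two fixed-length vectors and returns a non-negative score that is zero for identical inputs.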

A New Feature Selection Methodology for K-mers Representation of DNA Sequences

2015

Decomposing a DNA sequence into k-mers and counting their frequencies defines a mapping of the sequence into a numerical space, represented by a numerical feature vector of fixed length. This simple process allows sequences to be compared in an alignment-free way, using common similarity and distance functions on the numerical codomain of the mapping. The most commonly used decomposition takes all substrings of a fixed length k, which makes the dimension of the codomain exponential in k. This can affect the time complexity of the similarity computation and, more generally, of the machine learning algorithm used for sequence analysis. Moreover, the presence of possible noisy features can also affect the…

k-mers; DNA sequence similarity; feature selection; DNA sequence classification; Settore INF/01 - Informatica; Computer science; Sequence analysis; Feature vector; Pattern recognition; DNA sequencing; Substring; Exponential function; Artificial intelligence; Algorithm; Time complexity
researchProduct